Stochastic Grammatical Inference of Text Database

Author

  • MATTHEW YOUNG-LAI
Abstract

For a document collection in which structural elements are identified with markup, it is often necessary to construct a grammar retrospectively that constrains element nesting and ordering. This has been addressed by others as an application of grammatical inference. We describe an approach based on stochastic grammatical inference which scales more naturally to large data sets and produces models with richer semantics. We adopt an algorithm that produces stochastic finite automata and describe modifications that enable better interactive control of results. Our experimental evaluation uses four document collections with varying structure.
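
To make the idea of a stochastic finite automaton over element sequences concrete, here is a minimal sketch. It is not the algorithm adopted in the paper, which the abstract does not specify. It only builds a frequency-annotated prefix tree from sequences of child element names and uses the counts as transition probabilities. The function names, the END marker, and the sample element names are all hypothetical.

```python
from collections import defaultdict

END = "<END>"  # hypothetical marker: the sequence stops in this state


def build_prefix_automaton(sequences):
    """Build a prefix-tree automaton with transition counts.

    transitions[state][symbol] -> next state
    counts[state][symbol]      -> how often that transition (or END) was seen
    """
    transitions = [{}]              # state 0 is the start state
    counts = [defaultdict(int)]
    for seq in sequences:
        state = 0
        for symbol in seq:
            counts[state][symbol] += 1
            if symbol not in transitions[state]:
                transitions[state][symbol] = len(transitions)
                transitions.append({})
                counts.append(defaultdict(int))
            state = transitions[state][symbol]
        counts[state][END] += 1     # this sequence ended here
    return transitions, counts


def sequence_probability(transitions, counts, seq):
    """Probability the automaton assigns to seq (0.0 if it was never seen)."""
    state, prob = 0, 1.0
    for symbol in seq:
        total = sum(counts[state].values())
        if total == 0 or symbol not in transitions[state]:
            return 0.0
        prob *= counts[state][symbol] / total
        state = transitions[state][symbol]
    total = sum(counts[state].values())
    if total == 0:
        return 0.0
    return prob * counts[state].get(END, 0) / total


# Hypothetical child-element sequences observed under one parent element.
samples = [
    ["title", "para"],
    ["title", "para", "para"],
    ["title"],
    ["title", "para"],
]
transitions, counts = build_prefix_automaton(samples)
print(sequence_probability(transitions, counts, ["title", "para"]))
print(sequence_probability(transitions, counts, ["para"]))
```

A state-merging inference method would typically start from a structure like this and generalize it by merging states whose outgoing frequencies look statistically similar; that merging step is deliberately left out of the sketch.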

Similar resources

Stochastic Grammatical Inference with Multinomial Tests

We present a new statistical framework for stochastic grammatical inference algorithms based on a state merging strategy. We propose to use multinomial statistical tests to decide which states should be merged. This approach has three main advantages. First, since it is not based on asymptotic results, small sample case can be specifically dealt with. Second, all the probabilities associated to...

Learning Stochastic Context-Free Grammars from Corpora Using a Genetic Algorithm

A genetic algorithm for inferring stochastic context-free grammars from finite language samples is described. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We describe a number of experiments in learning grammars for a range of formal languages. The results of these experiments are encouraging and compare very favour...

A study of Grammatical Inference Algorithms in Automatic Music Composition and Musical Style Recognition

A study of the application of Grammatical Inference (GI) in the field of Music is presented. We have studied three GI Algorithms which have been previously applied successfully in other fields. In this work, these algorithms have been used to learn a stochastic grammar for each of three different musical styles from examples of melodies. Then, each of the learned grammars was used to stochastic...

Recent Advances of Grammatical Inference

In this paper, we provide a survey of recent advances in the field “Grammatical Inference” with a particular emphasis on the results concerning the learnability of target classes represented by deterministic finite automata, context-free grammars, hidden Markov models, stochastic context-free grammars, simple recurrent neural networks, and case-based representations.

Modelling Biological Sequences by Grammatical Inference

This document is a complement of the tutorial on Modelling Biological Sequences by Grammatical Inference organized for the tenth anniversary edition of the International Colloquium on Grammatical Inference (ICGI 2010) held in Valencia, Spain. The tutorial surveys the approaches related to grammatical inference which have been developed in Bioinformatics to model family of sequences, from well e...

Publication date: 2000